Prediction of functional modules based on gene distributions in microbial genomes.
Abstract
We present a computational method for predicting functional modules that can be applied directly to newly sequenced microbial genomes to predict gene functions and the component genes of biological pathways. We first quantify the functional relatedness among genes based on their distribution (i.e., their presence and order) across multiple microbial genomes, obtaining a gene network in which every pair of genes is associated with a score representing their functional relatedness. We then apply a threshold-based clustering algorithm to this gene network to obtain modules in which the number of genes is bounded from above by a pre-specified value and the component genes are more strongly functionally related to each other than to genes in other predicted modules. In particular, when the module size is bounded by 130, we obtain 167 functional modules covering 813 genes for Escherichia coli K12 and 138 functional modules covering 731 genes for Bacillus subtilis subsp. subtilis str. 168. We used Gene Ontology (GO) information to assess the predictions: the GO similarities among genes of the same functional module are compared with the GO similarities among genes that are randomly clustered together. This comparison shows that the predicted functional modules are statistically and biologically significant, and that genes of the same functional module share more commonality in terms of biological process than in terms of molecular function or cellular component. We also examined the predicted functional modules common to both Escherichia coli K12 and Bacillus subtilis subsp. subtilis str. 168, and provide explanations for some of them.
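The abstract does not specify the clustering procedure beyond "threshold-based" and "bounded module size", so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that edges scoring below a cutoff are dropped, that connected components become candidate modules, and that any component larger than the bound is re-split at the next higher cutoff. The function names, the `scores` input format, and the choice to discard single-gene components are all illustrative assumptions.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


def bounded_modules(scores, max_size):
    """Split a scored gene network into modules of at most max_size genes.

    scores: dict mapping (gene_a, gene_b) -> relatedness score (higher = more
    related). This input format is hypothetical. max_size must be >= 1.
    """
    cutoffs = sorted(set(scores.values()))
    genes = {gene for pair in scores for gene in pair}
    modules = []
    pending = [(genes, 0)]  # (candidate gene set, index of cutoff to apply)
    while pending:
        members, level = pending.pop()
        # Past the last cutoff no edge survives, so every gene is isolated.
        cutoff = cutoffs[level] if level < len(cutoffs) else float("inf")
        edges = [
            (a, b)
            for (a, b), score in scores.items()
            if score >= cutoff and a in members and b in members
        ]
        for component in connected_components(members, edges):
            if len(component) > max_size:
                # Too large: retry this component with a stricter cutoff.
                pending.append((set(component), level + 1))
            elif len(component) > 1:
                # Assumption: single-gene components are not reported.
                modules.append(component)
    return sorted(modules)


example_scores = {
    ("a", "b"): 0.9,
    ("b", "c"): 0.8,
    ("c", "d"): 0.2,
    ("d", "e"): 0.9,
}
print(bounded_modules(example_scores, max_size=3))
# [['a', 'b', 'c'], ['d', 'e']]
```

In this toy network the weak c-d link (0.2) is the first edge removed, which separates the five genes into two modules that each respect the size bound of 3.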
Related articles
Prediction of functional modules based on comparative genome analysis and Gene Ontology application
We present a computational method for the prediction of functional modules encoded in microbial genomes. In this work, we have also developed a formal measure to quantify the degree of consistency between the predicted and the known modules, and have carried out statistical significance analysis of consistency measures. We first evaluate the functional relationship between two genes from three ...
Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
BACKGROUND Efficient industrial processes for converting plant lignocellulosic materials into biofuels are a key to global efforts to come up with alternative energy sources to fossil fuels. Novel cellulolytic enzymes have been discovered in microbial genomes and metagenomes of microbial communities. However, the identification of relevant genes without known homologs, and the elucidation of th...
The in Silico Characterization of a Salicylic Acid Analogue Coding Gene Clusters in Selected Pseudomonas Fluorescens Strains
Background: Microbial genome sequences provide a solid in silico framework for interpreting the biosynthetic potential of their drug-like chemical scaffolds. The species Pseudomonas fluorescens is metabolically versatile and produces therapeutically important natural products. Objectives: The main objective of the present study was to mine the publicly available data of P. fluorescens stra...
Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes
The growing amount of information on biological sequences has made the application of statistical approaches necessary for modeling and estimating their functions. In this paper, the sensitivity and specificity of the first and second Markov chains for prediction of genes were evaluated using the complete double stranded DNA virus. There were two approaches for prediction of each Markov Model parameter,...
Development of joint application strategies for two microbial gene finders
MOTIVATION As a starting point in annotation of bacterial genomes, gene finding programs are used for the prediction of functional elements in the DNA sequence. Due to the faster pace and increasing number of genome projects currently underway, it is becoming especially important to have performant methods for this task. RESULTS This study describes the development of joint application strate...
Journal:
- Genome informatics. International Conference on Genome Informatics
Volume 16, Issue 2
Publication year: 2005